Grade of Membership Model and Visualization for RNA-seq data using CountClust
Abstract
Grade of membership (GoM) models, also known as admixture models or Latent Dirichlet Allocation, generalize cluster models by allowing each sample to have membership in multiple clusters. They are widely used in population genetics to model the ancestry of individuals from SNP or microsatellite data, and in natural language processing to model documents [1, 3]. This R package implements tools to visualize the clusters obtained from fitting topic models using a Structure plot [2] and to extract the top features (genes) that distinguish the clusters. In the presence of known technical or batch effects, the package also allows correction for these confounding effects.
CountClust version: 1.2.0. This document used the vignettes from the Bioconductor packages DESeq2 and cellTree as knitr templates.
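The idea of partial membership can be illustrated with a minimal sketch. It is written in plain Python and does not use the CountClust API; the function names, the two cluster profiles, and the membership weights are all hypothetical values chosen for illustration. It assumes the usual multinomial formulation of a GoM model, in which a sample's gene probabilities are a membership-weighted average of the cluster profiles.

```python
import random

def mixed_profile(omega, theta):
    """Gene probabilities for one sample: sum over k of omega[k] * theta[k][g]."""
    n_genes = len(theta[0])
    return [sum(w * profile[g] for w, profile in zip(omega, theta))
            for g in range(n_genes)]

def simulate_counts(omega, theta, depth, rng):
    """Draw `depth` reads from the sample's mixed gene profile."""
    probs = mixed_profile(omega, theta)
    counts = [0] * len(probs)
    for g in rng.choices(range(len(probs)), weights=probs, k=depth):
        counts[g] += 1
    return counts

# Two hypothetical clusters over three genes (each row sums to 1).
theta = [[0.7, 0.2, 0.1],
         [0.1, 0.2, 0.7]]

pure_sample = [1.0, 0.0]    # full membership in cluster 0 (ordinary clustering)
mixed_sample = [0.5, 0.5]   # equal membership in both clusters

rng = random.Random(0)      # fixed seed so the simulation is repeatable
pure_counts = simulate_counts(pure_sample, theta, 1000, rng)
mixed_counts = simulate_counts(mixed_sample, theta, 1000, rng)
```

A sample whose weights are all on one cluster reduces to the ordinary cluster model, while the mixed sample draws its reads from a profile halfway between the two clusters. Fitting a GoM model is the reverse problem: estimating the weights and the profiles from observed counts.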
Similar resources
Clustering RNA-seq expression data using grade of membership models
Grade of membership models, also known as “admixture models”, “topic models” or “Latent Dirichlet Allocation”, are a generalization of cluster models that allow each sample to have membership in multiple clusters. These models are widely used in population genetics to model admixed individuals who have ancestry from multiple “populations”, and in natural language processing to model documents h...
Visualizing the structure of RNA-seq expression data using grade of membership models
Grade of membership models, also known as "admixture models", "topic models" or "Latent Dirichlet Allocation", are a generalization of cluster models that allow each sample to have membership in multiple clusters. These models are widely used in population genetics to model admixed individuals who have ancestry from multiple "populations", and in natural language processing to model documents h...
Correction: Visualizing the structure of RNA-seq expression data using grade of membership models
[This corrects the article DOI: 10.1371/journal.pgen.1006599.].
Investigating the Function of Predicted Proteins from RNA-Seq Data in Holstein and Cholistani Cattle Breeds
This study was performed to determine the digital expression profile of different genes expressed in Holstein and Cholistani breeds as well as to evaluate the performance of predicted proteins derived from differentially expressed genes between these two breeds using RNA-Seq data. For this purpose, the whole mRNA sequence for a blood sample of American Holstein and Pakistani Cholistani cattle p...
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...
Publication date: 2016